feat: Add CreateVectorIndexReference WriterTarget and SPI methods (#27261) #27261

Closed
skyelves wants to merge 2 commits into prestodb:master from skyelves:export-D95325176

@skyelves skyelves commented Mar 5, 2026

Summary:

Add dedicated WriterTarget subclass and ConnectorMetadata SPI for
CREATE VECTOR INDEX, enabling each connector to implement vector index
creation independently.

  • CreateVectorIndexReference: plan-time target carrying index metadata
    and source table reference
  • beginCreateVectorIndex/finishCreateVectorIndex: SPI defaults to
    NOT_SUPPORTED so connectors must opt in
  • ClassLoaderSafeConnectorMetadata: delegation wrappers
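
The opt-in behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the real Presto SPI: `SketchConnectorMetadata`, `NotSupportedException`, `LegacyMetadata`, and `IndexingMetadata` are hypothetical stand-ins, and the handle, metadata, and fragment types are collapsed to `String` so the example compiles on its own.

```java
import java.util.Collection;
import java.util.Optional;

// Hypothetical stand-in for an exception carrying the NOT_SUPPORTED error code.
class NotSupportedException extends RuntimeException {
    NotSupportedException(String message) {
        super(message);
    }
}

// Simplified stand-in for ConnectorMetadata; all SPI types are reduced to String.
interface SketchConnectorMetadata {
    default String beginCreateVectorIndex(String indexName, String sourceTableName) {
        throw new NotSupportedException("This connector does not support creating vector indexes");
    }

    default Optional<String> finishCreateVectorIndex(String tableHandle, Collection<String> fragments) {
        throw new NotSupportedException("This connector does not support creating vector indexes");
    }
}

// A connector that has not opted in inherits both throwing defaults.
class LegacyMetadata implements SketchConnectorMetadata {}

// A connector that opts in overrides both methods.
class IndexingMetadata implements SketchConnectorMetadata {
    @Override
    public String beginCreateVectorIndex(String indexName, String sourceTableName) {
        return "handle:" + indexName + "<-" + sourceTableName;
    }

    @Override
    public Optional<String> finishCreateVectorIndex(String tableHandle, Collection<String> fragments) {
        return Optional.of(tableHandle + " committed with " + fragments.size() + " fragment(s)");
    }
}

class SpiDefaultsDemo {
    // Returns true when the connector rejects the call through the default method.
    static boolean rejects(SketchConnectorMetadata metadata) {
        try {
            metadata.beginCreateVectorIndex("idx", "schema.t");
            return false;
        }
        catch (NotSupportedException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("legacy rejects: " + rejects(new LegacyMetadata()));
        System.out.println("indexing rejects: " + rejects(new IndexingMetadata()));
    }
}
```

Because the defaults throw rather than silently doing nothing, existing connectors keep compiling and fail loudly if the statement reaches them.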

Release Notes

== NO RELEASE NOTE ==

Differential Revision: D95325176

@skyelves skyelves requested review from a team, feilong-liu and jaystarshot as code owners March 5, 2026 04:48

sourcery-ai bot commented Mar 5, 2026

Reviewer's Guide

Introduces a new CREATE VECTOR INDEX SQL statement end-to-end (parser, AST, analyzer, formatter, and tests) plus a new TableWriterNode.CreateVectorIndexReference and connector SPI hooks so connectors can implement vector index creation with their own metadata and layouts.

Sequence diagram for CREATE VECTOR INDEX execution path

sequenceDiagram
    actor User
    participant Frontend as PrestoFrontend
    participant Parser as SqlParser
    participant Analyzer as StatementAnalyzer
    participant Planner as PlanBuilder
    participant Exec as ExecutionEngine
    participant Meta as ConnectorMetadata

    User->>Frontend: submit "CREATE VECTOR INDEX ..."
    Frontend->>Parser: parse SQL
    Parser-->>Frontend: CreateVectorIndex AST

    Frontend->>Analyzer: analyze(CreateVectorIndex)
    Analyzer->>Analyzer: validate source and target tables
    Analyzer->>Analyzer: validate columns exist in source table
    Analyzer->>Analyzer: validate properties
    Analyzer->>Frontend: setCreateVectorIndexAnalysis(CreateVectorIndexAnalysis)
    Analyzer-->>Frontend: analyzed statement

    Frontend->>Planner: build plan
    Planner->>Planner: create TableWriterNode
    Planner->>Planner: create CreateVectorIndexReference
    Planner-->>Frontend: physical plan

    Frontend->>Exec: schedule query(plan)

    loop for each connector fragment
        Exec->>Meta: beginCreateVectorIndex(session, indexMetadata, layout, sourceTableName)
        Meta-->>Exec: ConnectorOutputTableHandle
        Exec->>Meta: finishCreateVectorIndex(session, tableHandle, fragments, statistics)
        Meta-->>Exec: Optional ConnectorOutputMetadata
    end

    Exec-->>Frontend: index creation completed
    Frontend-->>User: success response
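
The begin/finish part of the sequence above can be sketched as a small in-memory lifecycle. This is an illustration only: `InMemoryIndexMetadata` and `LifecycleDemo` are hypothetical names, handles and fragments are plain strings, and the three simulated writers stand in for whatever the execution engine actually schedules.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical in-memory connector; String handles replace ConnectorOutputTableHandle.
class InMemoryIndexMetadata {
    private final List<String> committed = new ArrayList<>();
    private String openHandle;

    // Opens an index build and returns the handle that writers will target.
    String beginCreateVectorIndex(String indexTable, String sourceTable) {
        if (openHandle != null) {
            throw new IllegalStateException("index build already in progress");
        }
        openHandle = indexTable + "@" + sourceTable;
        return openHandle;
    }

    // Commits the fragments produced by the writers and closes the build.
    Optional<String> finishCreateVectorIndex(String handle, List<String> fragments) {
        if (!handle.equals(openHandle)) {
            throw new IllegalArgumentException("unknown handle: " + handle);
        }
        committed.addAll(fragments);
        openHandle = null;
        return Optional.of(handle + ":" + fragments.size());
    }

    List<String> committedFragments() {
        return new ArrayList<>(committed);
    }
}

class LifecycleDemo {
    // Mirrors the diagram: begin, collect writer fragments, then finish.
    static Optional<String> run(InMemoryIndexMetadata metadata) {
        String handle = metadata.beginCreateVectorIndex("my_index", "my_table");
        List<String> fragments = new ArrayList<>();
        for (int writer = 0; writer < 3; writer++) {
            fragments.add("fragment-" + writer);
        }
        return metadata.finishCreateVectorIndex(handle, fragments);
    }

    public static void main(String[] args) {
        System.out.println(run(new InMemoryIndexMetadata()));
    }
}
```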

Class diagram for new CREATE VECTOR INDEX types and SPI hooks

classDiagram
    class Statement

    class CreateVectorIndex {
        +CreateVectorIndex(Identifier indexName, QualifiedName tableName, List~Identifier~ columns, Optional~Expression~ updatingFor, List~Property~ properties)
        +CreateVectorIndex(NodeLocation location, Identifier indexName, QualifiedName tableName, List~Identifier~ columns, Optional~Expression~ updatingFor, List~Property~ properties)
        +Identifier getIndexName()
        +QualifiedName getTableName()
        +List~Identifier~ getColumns()
        +Optional~Expression~ getUpdatingFor()
        +List~Property~ getProperties()
    }

    CreateVectorIndex --|> Statement

    class Analysis {
        -Optional~CreateVectorIndexAnalysis~ createVectorIndexAnalysis
        +void setCreateVectorIndexAnalysis(CreateVectorIndexAnalysis analysis)
        +Optional~CreateVectorIndexAnalysis~ getCreateVectorIndexAnalysis()
    }

    class CreateVectorIndexAnalysis {
        +CreateVectorIndexAnalysis(QualifiedObjectName sourceTableName, QualifiedObjectName targetTableName, List~Identifier~ columns, Map~String, Expression~ properties, Optional~Expression~ updatingFor)
        +QualifiedObjectName getSourceTableName()
        +QualifiedObjectName getTargetTableName()
        +List~Identifier~ getColumns()
        +Map~String, Expression~ getProperties()
        +Optional~Expression~ getUpdatingFor()
    }

    Analysis *-- CreateVectorIndexAnalysis

    class WriterTarget {
        <<abstract>>
        +ConnectorId getConnectorId()
        +SchemaTableName getSchemaTableName()
        +Optional~List~OutputColumnMetadata~~ getOutputColumns()
    }

    class CreateVectorIndexReference {
        -ConnectorId connectorId
        -ConnectorTableMetadata tableMetadata
        -Optional~NewTableLayout~ layout
        -Optional~List~OutputColumnMetadata~~ columns
        -SchemaTableName sourceTableName
        +CreateVectorIndexReference(ConnectorId connectorId, ConnectorTableMetadata tableMetadata, Optional~NewTableLayout~ layout, Optional~List~OutputColumnMetadata~~ columns, SchemaTableName sourceTableName)
        +ConnectorId getConnectorId()
        +ConnectorTableMetadata getTableMetadata()
        +Optional~NewTableLayout~ getLayout()
        +SchemaTableName getSchemaTableName()
        +Optional~List~OutputColumnMetadata~~ getOutputColumns()
        +SchemaTableName getSourceTableName()
    }

    CreateVectorIndexReference --|> WriterTarget

    class ConnectorMetadata {
        <<interface>>
        +ConnectorOutputTableHandle beginCreateVectorIndex(ConnectorSession session, ConnectorTableMetadata indexMetadata, Optional~ConnectorNewTableLayout~ layout, SchemaTableName sourceTableName)
        +Optional~ConnectorOutputMetadata~ finishCreateVectorIndex(ConnectorSession session, ConnectorOutputTableHandle tableHandle, Collection~Slice~ fragments, Collection~ComputedStatistics~ computedStatistics)
    }

    class ClassLoaderSafeConnectorMetadata {
        -ConnectorMetadata delegate
        -ClassLoader classLoader
        +ConnectorOutputTableHandle beginCreateVectorIndex(ConnectorSession session, ConnectorTableMetadata indexMetadata, Optional~ConnectorNewTableLayout~ layout, SchemaTableName sourceTableName)
        +Optional~ConnectorOutputMetadata~ finishCreateVectorIndex(ConnectorSession session, ConnectorOutputTableHandle tableHandle, Collection~Slice~ fragments, Collection~ComputedStatistics~ computedStatistics)
    }

    ClassLoaderSafeConnectorMetadata ..|> ConnectorMetadata
    ClassLoaderSafeConnectorMetadata o-- ConnectorMetadata
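
As a rough illustration of the `CreateVectorIndexReference` shape in the class diagram, the sketch below models an immutable writer target. It is not the PR's code: `WriterTargetSketch` and `CreateVectorIndexReferenceSketch` are hypothetical, the metadata and layout types are reduced to `String`, and having `getSchemaTableName()` return the index table name is an assumption.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical, simplified stand-in for the abstract WriterTarget.
abstract class WriterTargetSketch {
    abstract String getConnectorId();

    abstract String getSchemaTableName();
}

// Immutable target carrying both the index table and the table it is built from.
final class CreateVectorIndexReferenceSketch extends WriterTargetSketch {
    private final String connectorId;
    private final String indexTableName;
    private final Optional<String> layout;
    private final String sourceTableName;

    CreateVectorIndexReferenceSketch(
            String connectorId,
            String indexTableName,
            Optional<String> layout,
            String sourceTableName) {
        this.connectorId = Objects.requireNonNull(connectorId, "connectorId is null");
        this.indexTableName = Objects.requireNonNull(indexTableName, "indexTableName is null");
        this.layout = Objects.requireNonNull(layout, "layout is null");
        this.sourceTableName = Objects.requireNonNull(sourceTableName, "sourceTableName is null");
    }

    @Override
    String getConnectorId() {
        return connectorId;
    }

    // Assumption: the target's table name is the index table, not the source table.
    @Override
    String getSchemaTableName() {
        return indexTableName;
    }

    Optional<String> getLayout() {
        return layout;
    }

    String getSourceTableName() {
        return sourceTableName;
    }

    @Override
    public String toString() {
        return indexTableName + " (from " + sourceTableName + ")";
    }
}
```

Keeping the source table on the target is what lets a connector relate the index back to its origin at begin time without re-resolving names.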

File-Level Changes

Change Details Files
Add CREATE VECTOR INDEX syntax, AST node, and end-to-end parsing/formatting support including error handling tests.
  • Extend the SQL grammar with CREATE VECTOR INDEX, VECTOR/INDEX/UPDATING tokens, and relaxed properties list to allow an optional trailing comma.
  • Implement the CreateVectorIndex AST node, including visitor hooks in AstVisitor and DefaultTraversalVisitor.
  • Add AstBuilder support to construct CreateVectorIndex nodes with index name, table name, column list, optional UPDATING FOR expression, and properties.
  • Extend SqlFormatter to pretty-print CREATE VECTOR INDEX statements with WITH properties and UPDATING FOR clauses.
  • Add positive and negative parser tests for CREATE VECTOR INDEX and update error messages to include VECTOR where appropriate.
presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4
presto-parser/src/main/java/com/facebook/presto/sql/parser/AstBuilder.java
presto-parser/src/main/java/com/facebook/presto/sql/SqlFormatter.java
presto-parser/src/main/java/com/facebook/presto/sql/tree/CreateVectorIndex.java
presto-parser/src/main/java/com/facebook/presto/sql/tree/AstVisitor.java
presto-parser/src/main/java/com/facebook/presto/sql/tree/DefaultTraversalVisitor.java
presto-parser/src/test/java/com/facebook/presto/sql/parser/TestSqlParser.java
presto-parser/src/test/java/com/facebook/presto/sql/parser/TestSqlParserErrorHandling.java
Wire CREATE VECTOR INDEX into analysis and query-type classification so it is treated as an INSERT-like operation and validated semantically.
  • Add Analysis.CreateVectorIndexAnalysis to carry source/target table names, columns, properties, and optional UPDATING FOR expression.
  • Track an optional createVectorIndexAnalysis field in Analysis with setters/getters.
  • Implement StatementAnalyzer.visitCreateVectorIndex to resolve and validate source and target tables, verify referenced columns exist, validate properties, add access control for table creation, and populate CreateVectorIndexAnalysis.
  • Classify CreateVectorIndex statements as QueryType.INSERT in StatementUtils.
presto-analyzer/src/main/java/com/facebook/presto/sql/analyzer/Analysis.java
presto-analyzer/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java
presto-analyzer/src/main/java/com/facebook/presto/sql/analyzer/utils/StatementUtils.java
Add a CreateVectorIndexReference WriterTarget and corresponding ConnectorMetadata SPI methods so connectors can create vector indexes with data.
  • Introduce TableWriterNode.CreateVectorIndexReference WriterTarget carrying connectorId, index table metadata, optional layout, optional output columns, and the source table name.
  • Extend ConnectorMetadata with beginCreateVectorIndex and finishCreateVectorIndex default methods that throw NOT_SUPPORTED unless overridden by connectors.
  • Add ClassLoaderSafeConnectorMetadata delegating implementations of the new beginCreateVectorIndex and finishCreateVectorIndex methods with classloader safety.
presto-spi/src/main/java/com/facebook/presto/spi/plan/TableWriterNode.java
presto-spi/src/main/java/com/facebook/presto/spi/connector/ConnectorMetadata.java
presto-spi/src/main/java/com/facebook/presto/spi/connector/classloader/ClassLoaderSafeConnectorMetadata.java
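
The "classloader safety" of the delegating wrapper can be sketched as follows. This is a simplified illustration, not the PR's implementation: `IndexMetadataSketch` and `ClassLoaderSafeIndexMetadata` are hypothetical, and the context-classloader swap is inlined with a `try`/`finally` instead of whatever helper the real wrapper uses.

```java
import java.util.Objects;
import java.util.function.Supplier;

// Hypothetical single-method slice of the metadata interface.
interface IndexMetadataSketch {
    String beginCreateVectorIndex(String indexName);
}

// Delegates every call while the plugin's classloader is the thread context classloader.
final class ClassLoaderSafeIndexMetadata implements IndexMetadataSketch {
    private final IndexMetadataSketch delegate;
    private final ClassLoader classLoader;

    ClassLoaderSafeIndexMetadata(IndexMetadataSketch delegate, ClassLoader classLoader) {
        this.delegate = Objects.requireNonNull(delegate, "delegate is null");
        this.classLoader = Objects.requireNonNull(classLoader, "classLoader is null");
    }

    @Override
    public String beginCreateVectorIndex(String indexName) {
        return withClassLoader(() -> delegate.beginCreateVectorIndex(indexName));
    }

    // Swaps the context classloader for the duration of the call and always restores it.
    private <T> T withClassLoader(Supplier<T> action) {
        Thread thread = Thread.currentThread();
        ClassLoader original = thread.getContextClassLoader();
        thread.setContextClassLoader(classLoader);
        try {
            return action.get();
        }
        finally {
            thread.setContextClassLoader(original);
        }
    }
}
```

Restoring the original loader in `finally` matters because an exception from the connector (for example the NOT_SUPPORTED default) would otherwise leave the engine thread bound to the plugin's classloader.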

github-actions bot commented Mar 5, 2026

Codenotify: Notifying subscribers in CODENOTIFY files for diff 5022f6b...37b2e82.

No notifications.

@sourcery-ai sourcery-ai bot left a comment

Hey - I've found 3 issues, and left some high level feedback:

  • The CREATE VECTOR INDEX grammar rule currently restricts the column list to at most two identifiers (identifier (',' identifier)?), which conflicts with the tests that use three columns and beyond; this should be changed to allow an arbitrary number of columns (e.g., identifier (',' identifier)*).
  • The grammar for CREATE VECTOR INDEX only supports ON TABLE qualifiedName, but the parser tests expect both ON t(a, b) and ON TABLE t(a, b) forms; consider updating the rule to make TABLE optional so the parser and tests are aligned.
  • In DefaultTraversalVisitor.visitCreateVectorIndex, the tableName is not traversed (only indexName, columns, updatingFor, and properties are processed); if any visitors rely on visiting the table name, you should add process(node.getTableName(), context) for consistency with other statement visitors.
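
The difference between the `?` and `*` quantifiers in the first point can be seen with a regular-expression analogue of the column-list rule. This is only an analogy under stated assumptions: the patterns below are hypothetical, `\w+` stands in for the grammar's `identifier`, and ANTLR rules are not regular expressions, although the two quantifiers mean the same thing in both notations.

```java
import java.util.regex.Pattern;

class ColumnListQuantifierDemo {
    // Analogue of '(' identifier (',' identifier)? ')': one or two columns only.
    static final Pattern AT_MOST_TWO =
            Pattern.compile("\\(\\s*\\w+(\\s*,\\s*\\w+)?\\s*\\)");

    // Analogue of '(' identifier (',' identifier)* ')': any number of columns.
    static final Pattern ANY_NUMBER =
            Pattern.compile("\\(\\s*\\w+(\\s*,\\s*\\w+)*\\s*\\)");

    // True only when the whole column list matches the pattern.
    static boolean accepts(Pattern pattern, String columnList) {
        return pattern.matcher(columnList).matches();
    }

    public static void main(String[] args) {
        String[] inputs = {"(a)", "(a, b)", "(a, b, c)"};
        for (String input : inputs) {
            System.out.println(input
                    + " ?=" + accepts(AT_MOST_TWO, input)
                    + " *=" + accepts(ANY_NUMBER, input));
        }
    }
}
```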
Prompt for AI Agents
Please address the comments from this code review:

## Individual Comments

### Comment 1
<location path="presto-parser/src/main/antlr4/com/facebook/presto/sql/parser/SqlBase.g4" line_range="172-173" />
<code_context>
     | MERGE INTO qualifiedName (AS? identifier)?
-        USING relation ON expression mergeCase+                        #mergeInto
+          USING relation ON expression mergeCase+                        #mergeInto
+    | CREATE VECTOR INDEX identifier ON TABLE qualifiedName
+          '(' identifier (',' identifier)? ')'
+          (WITH properties)?
+          (UPDATING FOR booleanExpression)?                              #createVectorIndex
</code_context>
<issue_to_address>
**issue (bug_risk):** Column list grammar only allows at most two columns instead of an arbitrary list

This grammar only parses one or two identifiers, but the rest of the pipeline treats `columns` as an arbitrary list. To avoid rejecting valid statements with more than two columns, update it to allow a repeated list, e.g. ` '(' identifier (',' identifier)* ')'`.
</issue_to_address>

### Comment 2
<location path="presto-main-base/src/main/java/com/facebook/presto/sql/analyzer/StatementAnalyzer.java" line_range="1173-1175" />
<code_context>
+
+            Map<String, Expression> allProperties = mapFromProperties(node.getProperties());
+
+            analysis.addAccessControlCheckForTable(TABLE_CREATE,
+                    new AccessControlInfoForTable(accessControl, session.getIdentity(),
+                            session.getTransactionId(), session.getAccessControlContext(), targetTable));
+
+            analysis.setCreateVectorIndexAnalysis(new Analysis.CreateVectorIndexAnalysis(
</code_context>
<issue_to_address>
**🚨 suggestion (security):** Access control covers target table creation but not reading from the source table

You’ve added a `TABLE_CREATE` check on the target, but there’s no corresponding check for reading the source table, even though the index build will scan it. To match other DDL+data operations (CTAS, refresh materialized view), consider adding a source-table access control check (e.g., `SELECT` on `sourceTableName`).

Suggested implementation:

```java
            Map<String, Expression> allProperties = mapFromProperties(node.getProperties());

            analysis.addAccessControlCheckForTable(TABLE_SELECT,
                    new AccessControlInfoForTable(accessControl, session.getIdentity(),
                            session.getTransactionId(), session.getAccessControlContext(), sourceTableName));

            analysis.addAccessControlCheckForTable(TABLE_CREATE,
                    new AccessControlInfoForTable(accessControl, session.getIdentity(),
                            session.getTransactionId(), session.getAccessControlContext(), targetTable));

            analysis.setCreateVectorIndexAnalysis(new Analysis.CreateVectorIndexAnalysis(

```

1. Ensure that `TABLE_SELECT` is imported or statically referenced in this file in the same way as `TABLE_CREATE`. If it is not yet imported, add the appropriate import or static import alongside the existing `TABLE_CREATE` import.
2. Confirm that `sourceTableName` is the correct object to use for access control in this context (it should match the type used elsewhere for table access checks, typically the same type as `targetTable` or other `SELECT` checks in this class). If a different variable (e.g., `sourceTable`) is used for table access checks elsewhere in the `StatementAnalyzer`, replace `sourceTableName` with that variable for consistency.
</issue_to_address>

### Comment 3
<location path="presto-parser/src/test/java/com/facebook/presto/sql/parser/TestSqlParser.java" line_range="1384" />
<code_context>
     }

+    @Test
+    public void testCreateVectorIndex()
+    {
+        // Basic CREATE VECTOR INDEX
</code_context>
<issue_to_address>
**suggestion (testing):** Add formatter/round‑trip tests for CREATE VECTOR INDEX to ensure AST ↔ SQL consistency, especially around optional TABLE, WITH, and UPDATING FOR clauses

The parser tests cover valid/invalid forms well, but nothing exercises SqlFormatter’s new visitCreateVectorIndex implementation. Please add round‑trip/formatting tests (like the existing CREATE TABLE ones) to verify:

- `CREATE VECTOR INDEX idx ON t(a, b)` is formatted as expected (the formatter always prints `ON TABLE`)
- Properties are emitted with correct commas/spacing
- UPDATING FOR expressions (equality, BETWEEN, complex boolean) format correctly
- Formatting is stable under parse → format → parse cycles

This will catch divergences between parser and formatter or syntactically invalid formatted SQL.
</issue_to_address>

@skyelves skyelves force-pushed the export-D95325176 branch 3 times, most recently from 7cf44a9 to fe33047 Compare March 5, 2026 06:42
skyelves added a commit to skyelves/presto that referenced this pull request Mar 5, 2026
…estodb#27261)

Summary:

Add dedicated WriterTarget subclass and ConnectorMetadata SPI for
CREATE VECTOR INDEX, enabling each connector to implement vector index
creation independently.

- CreateVectorIndexReference: plan-time target carrying index metadata
  and source table reference
- beginCreateVectorIndex/finishCreateVectorIndex: SPI defaults to
  NOT_SUPPORTED so connectors must opt in
- ClassLoaderSafeConnectorMetadata: delegation wrappers

Differential Revision: D95325176
@skyelves skyelves force-pushed the export-D95325176 branch 2 times, most recently from 1679e01 to 54201c0 Compare March 5, 2026 07:54
@steveburnett
Contributor

Please add a release note entry following the Release Notes Guidelines.

@skyelves skyelves force-pushed the export-D95325176 branch 3 times, most recently from 3d1b664 to 4277277 Compare March 11, 2026 02:17
skyelves added a commit to skyelves/presto that referenced this pull request Mar 13, 2026
…estodb#27261)

@meta-codesync meta-codesync bot changed the title feat: Add CreateVectorIndexReference WriterTarget and SPI methods feat: Add CreateVectorIndexReference WriterTarget and SPI methods (#27261) Mar 13, 2026
skyelves added a commit to skyelves/presto that referenced this pull request Mar 13, 2026
…estodb#27261)

Summary:
Pull Request resolved: prestodb#27261

linux-foundation-easycla bot commented Mar 13, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@gggrace14 gggrace14 self-requested a review March 16, 2026 23:05
gggrace14 previously approved these changes Mar 16, 2026
@feilong-liu
Contributor

Looks like this PR has some conflict/overlap with #27307?

skyelves added a commit to skyelves/presto that referenced this pull request Mar 17, 2026
…estodb#27261)

## Release Notes
Please follow release notes guidelines and fill in the release notes below.
```
  == RELEASE NOTES ==
  General Changes
  * Add support for the ``CREATE VECTOR INDEX`` statement, which creates
    vector search indexes on table columns with configurable index properties
    and partition filtering via an ``UPDATING FOR`` clause.
```

Differential Revision: D95325176
@skyelves
Member Author

> Looks like this PR has some conflict/overlap with #27307?

Thanks, I just rebased

@skyelves skyelves force-pushed the export-D95325176 branch 2 times, most recently from ac616af to 77c9555 Compare March 18, 2026 17:24
skyelves added a commit to skyelves/presto that referenced this pull request Mar 18, 2026
…estodb#27261)

## Release Notes
```
== NO RELEASE NOTE ==
```

Differential Revision: D95325176
@skyelves skyelves force-pushed the export-D95325176 branch 2 times, most recently from 72016b4 to 196aaaf Compare March 19, 2026 00:57
Summary:
## High level design
The process for executing a CREATE VECTOR INDEX SQL statement is as follows:

1. SQL Input & Parsing:
  • SQL: CREATE VECTOR INDEX my_index ON my_table(id, embedding) WITH (...) UPDATING FOR ...
  • The parser (SqlBase.g4) generates a CreateVectorIndex Abstract Syntax Tree (AST) node.

2. Statement Analysis:
  • StatementAnalyzer.visitCreateVectorIndex() validates the source/target tables and extracts index properties.
  • This results in a structured CreateVectorIndexAnalysis object.

3. Logical Planning & Query Generation:
  • LogicalPlanner.createVectorIndexPlan() builds the core execution query:
    CREATE index_table AS SELECT create_vector_index(embedding, id) FROM my_table WHERE ds BETWEEN ...
  • The resulting plan tree includes:

    TableFinishNode(target = CreateVectorIndexReference)
    └── TableWriterNode(target = CreateVectorIndexReference)
        └── query plan

4. Connector Plan Optimization (Rewriting):
  • PRISM: The CreateVectorIndexRewriteOptimizer detects the CreateVectorIndexReference and rewrites the plan for optimization.
  • ICEBERG/OTHER: Other connector-specific optimizers may fire during this phase.

5. Execution and Metadata Handling (for connectors that don't rewrite):
  • TableWriteInfo Routing: The CreateVectorIndexReference triggers metadata.beginCreateVectorIndex().
  • Local Execution & Commit: The finisher and committer use the CreateVectorIndexHandle to call metadata.finishCreateVectorIndex() and metadata.commitPageSinkAsync().

6. ConnectorMetadata SPI:
  • Default: The standard implementation throws NOT_SUPPORTED.
  • Iceberg Override: The Iceberg connector implements this SPI to create the underlying table via the begin/finish calls.
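
The plan shape from step 3 and the target-based routing from step 5 can be sketched as follows. Everything here is a hypothetical simplification: the `*Sketch` classes are not Presto's plan nodes, a plan is modelled as a single chain of nodes, and the routing method only illustrates dispatching on the target type rather than reproducing TableWriteInfo.

```java
// Hypothetical writer targets; only the type distinction matters for routing.
abstract class TargetSketch {}

final class CreateTableTargetSketch extends TargetSketch {}

final class VectorIndexTargetSketch extends TargetSketch {
    final String indexTable;
    final String sourceTable;

    VectorIndexTargetSketch(String indexTable, String sourceTable) {
        this.indexTable = indexTable;
        this.sourceTable = sourceTable;
    }
}

// A plan node with at most one source; target is null for non-writing nodes.
final class PlanNodeSketch {
    final String name;
    final TargetSketch target;
    final PlanNodeSketch source;

    PlanNodeSketch(String name, TargetSketch target, PlanNodeSketch source) {
        this.name = name;
        this.target = target;
        this.source = source;
    }
}

final class VectorIndexPlanSketch {
    // Builds TableFinish -> TableWriter -> Query, sharing one target instance.
    static PlanNodeSketch build(String indexTable, String sourceTable) {
        TargetSketch target = new VectorIndexTargetSketch(indexTable, sourceTable);
        PlanNodeSketch query = new PlanNodeSketch("Query", null, null);
        PlanNodeSketch writer = new PlanNodeSketch("TableWriter", target, query);
        return new PlanNodeSketch("TableFinish", target, writer);
    }

    // Illustrative routing: the target type decides which begin call is made.
    static String beginCallFor(TargetSketch target) {
        if (target instanceof VectorIndexTargetSketch) {
            return "beginCreateVectorIndex";
        }
        return "beginCreateTable";
    }

    // Renders the chain with four spaces of indentation per level.
    static String render(PlanNodeSketch root) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (PlanNodeSketch node = root; node != null; node = node.source) {
            for (int i = 0; i < depth; i++) {
                out.append("    ");
            }
            out.append(node.name).append('\n');
            depth++;
        }
        return out.toString();
    }
}
```

A dedicated target type is what allows both a connector optimizer (step 4) and the execution-time routing (step 5) to recognise the statement without inspecting the query text.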


## Release Notes
```
== NO RELEASE NOTE ==
```



Differential Revision: D91524358

Pulled By: skyelves
…estodb#27261)

## Release Notes
```
== NO RELEASE NOTE ==
```

Differential Revision: D95325176

NivinCS commented Mar 19, 2026

@skyelves, could you please update the PR to include only the changes related to the CreateVectorIndexReference WriterTarget and the SPI methods? The analysis support changes also appear in this PR, which duplicates a separate PR that already covers them.

@skyelves skyelves closed this Mar 20, 2026